Data
The competition dataset consisted of features describing how much time a visitor spent on various pages of the site, personal details of the visitor such as gender, marital status, and education, and the OS and search engine the visitor used.
Some of the features are:
- HomePage: Number of times the visitor viewed this page.
- HomePage_Duration: Total time the visitor spent on this page.
- GoogleMetric-Bounce Rate: A bounce occurs when a user lands on a page of the website and exits from that same page without visiting any other page. The bounce rate is the percentage of the user's visits to the website that end in a bounce.
- SeasonalPurchase: A weight indicator that tracks seasonal purchases. If a user makes a purchase during a seasonal period (Mother's Day, Diwali, Valentine's Day), a weight is assigned based on an internal heuristic.
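To make the bounce rate definition concrete, here is a minimal Python sketch. The dataset ships this value precomputed, so this is not how the competition feature was produced. The function name `bounce_rate` and the input format (a list of page counts, one per visit) are assumptions made purely for illustration.

```python
def bounce_rate(pages_per_visit):
    """Return the percentage of visits that viewed exactly one page.

    pages_per_visit is a hypothetical input: one integer per visit,
    giving how many pages were viewed during that visit.
    """
    if not pages_per_visit:
        return 0.0  # no visits, so no bounces to report
    # A visit counts as a bounce when the user left after a single page.
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return 100.0 * bounces / len(pages_per_visit)


# Made-up example: five visits, two of which ended after one page.
visits = [1, 3, 1, 5, 2]
print(bounce_rate(visits))  # 40.0
```

Returning 0.0 for an empty history is a design choice for this sketch; a real pipeline might prefer a missing value instead.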
The other attributes, such as OS, Search Engine, Zone, Type of Traffic, Customer Type, Gender, Cookies Setting, Education, Marital Status, and Weekend Purchase, are self-explanatory.
The training data had 14731 samples, and the test data had 6599 samples for evaluation.
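As a quick sanity check on these sizes, the short Python snippet below works out the relative share of each split. It uses only the two sample counts quoted above; the variable names are illustrative.

```python
# Sample counts as stated in the text.
train_samples = 14731
test_samples = 6599

total = train_samples + test_samples
train_share = train_samples / total
test_share = test_samples / total

print(f"total samples: {total}")
print(f"train share:   {train_share:.1%}")
print(f"test share:    {test_share:.1%}")
```

This gives a split of roughly 69% training data to 31% test data.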